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ABSTRACT 


In  this  project,  we  developed  high-level,  human-like  specification  languages  for  robotic 
motion  tasks  and  computational  frameworks  for  automatic  synthesis  of  robot  control  and 
communication  strategies  from  such  specifications.  Central  to  our  approach  are  finite 
abstractions,  which  allow  for  the  use  of  (adapted)  temporal  logics  as  specification 
languages,  tools  from  formal  verification  resembling  model  checking  for  analysis  and 
control,  and  techniques  inspired  from  synchronization  in  concurrency  theory  for  synthesis 
of  communication  strategies.  In  short,  the  results  of  this  project  can  be  grouped  into:  (1) 
computational  tools  for  abstraction  and  temporal  logic  analysis  and  control  of  one 
dynamical  system  and  (2)  computational  frameworks  for  automatic  synthesis  of  control 
and  communication  strategies  for  a  robotic  team  from  global  specifications  given  as 
temporal  logic  statements  over  a  set  of  environmental  service  requests. 


REPORT 

The  results  of  this  3-year  project  include:  (1)  Computational  frameworks  for  abstraction 
and  Linear  Temporal  Logic  (LTL)  verification  and  control  of  discrete-time  linear 
systems;  (2)  A  computational  framework  for  LTL  control  of  continuous-time  multi-affine 
dynamics  with  time  constraints;  (3)  Tools  for  Markov  Decision  Process  (MDP)  control 
from  specifications  given  as  Probabilistic  Computation  Tree  Logic  (PCTL)  formulas;  (4) 
Tools  for  Markov  Decision  Process  (MDP)  control  from  specifications  given  as 
Probabilistic  LTL  (PLTL)  formulas;  (5)  Tools  for  continuous-time  Markov  Decision 
Process  (CTMDP)  control  from  specifications  given  as  Continuous  Stochastic  Logic 
(CSL)  formulas;  (6)  Computational  frameworks  for  automatic  synthesis  of  (optimal) 
control  and  communication  strategies  for  robotic  teams  from  global  specifications  given 
as  Regular  Expressions  (RE)  and  LTL  fonnulas  over  sets  of  environmental  service 
requests;  (7)  Computational  tool  for  LTL  robot  control  based  on  automata  learning  of 
environmental  dynamics.  For  all  of  the  above,  we  performed  experimental  vali  dations. 

In  the  following,  we  provide  more  details  on  these  results.  The  corresponding  papers  are 
cited  by  first  author,  conference  or  journal  abbreviation,  and  year  of  publication. 

(1)  Abstraction  and  LTL  verification  and  control  of  discrete-time  linear  systems 

As  part  of  the  first  objective  of  the  initial  proposal,  in  [Yordanov  IEEE  TAC  2012, 
Tumova  CDC  2010]  we  developed  a  computational  framework  for  automatic  synthesis  of 
a  feedback  control  strategy  for  a  piecewise  affine  (PWA)  system  from  a  specification 
given  as  a  Linear  Temporal  Logic  (LTL)  fonnula  over  an  arbitrary  set  of  linear  predicates 
in  its  state  variables.  Our  approach  consists  of  two  main  steps.  First,  by  defining 
appropriate  partitions  for  its  state  and  input  spaces,  we  construct  a  finite  abstraction  of  the 
PWA  system  in  the  fonn  of  a  control  transition  system.  Second,  by  leveraging  ideas  and 
techniques  from  Rabin  games  and  LTL  model  checking,  we  develop  an  algorithm  to 
generate  a  control  strategy  for  the  finite  abstraction.  While  provably  correct  and  robust  to 


small  perturbations  in  both  state  measurements  and  applied  inputs,  the  overall  procedure 
is  conservative  and  expensive. 

In  [Yordanov  IEEE  TAG  2010,  Yordanov  GDC  2010],  we  developed  a  computational 
framework  for  identifying  a  set  of  initial  states  from  which  all  trajectories  of  a  piecewise 
affine  (PWA)  system  satisfy  a  Linear  Temporal  Logic  (LTL)  formula  over  a  set  of  linear 
predicates  in  its  state  variables.  Our  approach  is  based  on  the  construction  and  refinement 
of  finite  abstractions  of  infinite  systems  (i.e.  systems  where  states  can  take  infinitely 
many  values).  We  derive  conditions  guaranteeing  the  equivalence  of  an  infinite  system 
and  its  finite  abstraction  with  respect  to  a  specific  temporal  logic  formula  and  propose 
methods  aimed  at  the  construction  of  such  formula-equivalent  abstractions.  We  show  that 
the  proposed  procedure  can  be  implemented  using  polyhedral  operations  and  analysis  of 
finite  graphs. 

Both  tools  described  above  were  implemented  as  user-friendly  packages  available  for 
download  at  our  website  hyness.bu.edu/software.  It  is  important  to  note  that  PWA 
systems  are  quite  general,  since  they  can  approximate  nonlinear  dynamics  (such  as 
aircraft  dynamics)  with  arbitrary  accuracy.  In  addition,  there  exist  several  computation 
tools  for  the  identification  of  such  models  from  input-output  experimental  data. 

(2)  A  computational  framework  for  LTL  control  of  continuous-time  multi-affine 
dynamics  with  time  constraints 

In  [Aydin-Gol  ADHS  2012],  we  considered  the  problem  of  controlling  a  dynamical 
system  such  that  its  trajectories  satisfy  a  temporal  logic  property  in  a  given  amount  of 
time.  We  focused  on  multi-affine  systems  and  specifications  given  as  syntactically  co¬ 
safe  linear  temporal  logic  formulas  over  rectangular  regions  in  the  state  space.  The 
proposed  algorithm  is  based  on  the  estimation  of  time  bounds  for  facet  reachability 
problems  and  solving  a  time  optimal  reachability  problem  on  the  product  between  a 
weighted  transition  system  and  an  automaton  that  enforces  the  satisfaction  of  the 
specification.  A  random  optimization  algorithm  was  used  to  iteratively  improve  the 
solution. 

(3)  MDP  Control  from  PCTL  Specifications  with  Applications  to  Robot  Motion 
Planning  and  Control 

In  [Lahijanian  ICRA  2010,  Lahijanian  ACC  2011,  Lahijanian  TRO  201 1],  we  considered 
a  robot  that  moves  in  a  partitioned  environment  by  applying  a  given  set  of  motion 
primitives  allowing  it  to  steer  between  adjacent  regions.  Due  to  sensor  and  actuation 
noise,  while  applying  an  available  motion  primitive  at  a  region,  the  robot  can  transit  to 
more  than  one  adjacent  regions.  We  assume  that  the  probabilities  of  these  transitions  are 
known.  We  also  assumed  that  the  robot  could  determine  its  current  region  precisely.  In 
indoor  environments,  such  an  assumption  is  not  overly  restrictive  since  simple 
environment  modifications  can  be  made  to  enforce  it,  such  as  placing  a  large  number  of 
radio  frequency  identity  (RFID)  tags.  We  solved  this  problem  by  abstracting  the  robot 
motion  to  an  MDP  and  deriving  an  algorithm  to  synthesize  control  strategies  from 
specifications  given  as  PCTL  formulas.  Examples  of  such  specifications  include 


"Eventually  reach  A  and  then  B  with  probability  greater  than  0.9  while  always  avoiding 
the  regions  from  which  the  probabilities  of  converging  to  C  is  greater  than  0.2”  and 
"Eventually  reach  a  region  from  which  the  probability  of  converging  to  A  is  0.8  while 
minimizing  the  total  amount  of  travel  time”. 

In  short,  given  a  specification  as  a  PCTL  formula,  the  algorithm  returns  the  maximum 
probability  or  the  minimum  cost  of  satisfaction  and  a  control  strategy  that  achieves  this 
probability  or  cost.  Our  algorithm  uses  sub-algorithms  corresponding  to  each  temporal 
operator  as  building  blocks  for  the  construction  of  a  control  strategy  from  a  formula  with 
multiple  temporal  operators.  The  most  computationally  expensive  sub-algorithm  requires 
solving  a  linear  programming  problem. 

We  illustrated  the  method  in  our  Robotic  InDoor  Environment,  in  which  an  iRobot 
iCreate  platform  equipped  with  RFID  readers  and  a  laser  range  finder  moves 
autonomously  through  the  corridors  and  intersections  of  an  easily  reconfigurable 
environment. 

In  [Cizelj  IF  AC  2011],  we  developed  an  approach  to  design  a  reactive  control  strategy 
that  provides  probabilistic  guarantees  of  accomplishing  the  mission  in  a  threat-rich 
environment.  This  control  strategy  is  reactive  in  the  sense  that  the  control  of  the  vehicle  is 
updated  whenever  the  vehicle  reaches  a  new  region  in  the  environment,  or  an  adversary 
moves  in  between  the  current  region  and  its  adjacent  region  (i.e.  if  the  vehicle  observes 
movements  of  adversaries,  it  updates  the  adversary  distributions  for  adjacent  regions  and 
chooses  a  different  control  action  as  needed).  In  order  to  solve  this  problem,  we  capture 
the  motion  of  the  vehicle,  as  well  as  vehicle  estimates  of  the  adversary  distributions  in  a 
MDP.  This  way,  we  map  the  vehicle  control  problem  to  the  problem  of  finding  a  control 
policy  for  an  MDP  such  that  the  probability  of  satisfying  a  PCTL  formula  is  maximized. 
For  the  latter,  we  used  the  MDP  control  approach  described  above. 

In  [Medina  IROS  2011],  we  consider  an  extension  of  the  PCTL  control  problem 
described  above  by  allowing  for  a  dynamic  (changing)  environment.  Specifically,  the 
environment  includes  doors  that  open  and  close  during  the  robot's  mission.  We  solve  this 
problem  under  three  settings  that  assume  different  levels  of  knowledge  and  sensing 
capabilities  of  the  robot.  In  the  first  setting  the  robot  is  given  \a  priori  infonnation  about 
the  states  of  the  doors,  but  it  can  only  learn  their  true  states  in  a  region  adjacent  to  them. 
The  second  setting  excludes  the  prior  information  about  the  states  of  the  doors  but  retains 
the  assumption  that  the  exact  state  of  any  door  is  known  when  the  robot  is  in  a  region 
adjacent  to  it.  In  the  last  setting,  this  assumption  is  also  relaxed  and  we  allow  for  possibly 
erroneous  measurements  as  to  the  state  of  the  doors  observed  by  the  robot.  A  Markov 
decision  process  (MDP)  is  used  to  model  the  system  under  the  first  setting.  The  second 
and  third  settings  are  cast  as  mixed  observable  Markov  decision  processes  (MOMDPs). 
We  consider  specifications  given  as  PCTL  formulas  and  develop  a  framework  for  the 
synthesis  of  control  strategies  from  such  specifications.  While  this  paper  focuses  for 
clarity  on  an  indoor  environment  with  doors,  the  problem  and  methods  developed  can 
easily  be  generalized  to  arbitrary  environments  with  regions  in  which  the  transitions  can 
be  open  or  blocked.  To  illustrate  these  methods,  we  used  the  Robotic  InDoor 


Environment  (RIDE)  Simulator  to  generate  the  MDP  and  MOMDP  models  for  a  robot 
moving  in  a  dynamic  environment  and  to  show  the  planning  of  the  robot. 

(4)  MDP  Control  from  Specifications  given  as  PLTL  Formulas  with  Applications  to 
Robot  Motion  Planning  and  Control 

In  [Ding  IF  AC  2011],  we  presented  a  method  to  generate  a  robot  control  strategy  that 
maximizes  the  probability  to  accomplish  a  task.  The  task  is  given  as  a  Linear  Temporal 
Logic  (LTL)  formula  over  a  set  of  properties  that  can  be  satisfied  at  the  regions  of  a 
partitioned  environment.  We  assume  that  the  probabilities  with  which  the  properties  are 
satisfied  at  the  regions  are  known,  and  the  robot  can  detennine  the  truth  value  of  a 
proposition  only  at  the  current  region.  Motivated  by  several  results  on  partitioned-based 
abstractions,  we  assume  that  the  motion  is  perfonned  on  a  graph.  To  account  for  noisy 
sensors  and  actuators,  we  assume  that  a  control  action  enables  several  transitions  with 
known  probabilities.  We  show  that  this  problem  can  be  reduced  to  the  problem  of 
generating  a  control  policy  for  a  Markov  Decision  Process  (MDP)  such  that  the 
probability  of  satisfying  an  LTL  formula  over  its  states  is  maximized.  We  provide  a 
complete  solution  for  the  latter  problem  that  builds  on  existing  results  from  probabilistic 
model  checking. 

In  [Ding  CDC  2011],  we  developed  a  method  to  automatically  generate  an  optimal 
control  policy  for  a  dynamical  system  modeled  as  a  Markov  Decision  Process  (MDP). 
The  control  specification  is  given  as  a  Linear  Temporal  Logic  (LTL)  formula  over  a  set 
of  propositions  defined  on  the  states  of  the  MDP.  We  used  the  method  described  above 
synthesize  a  control  policy  such  that  the  MDP  satisfies  the  given  specification  almost 
surely,  if  such  a  policy  exists.  In  addition,  we  designate  an  "optimizing  proposition”  to  be 
repeatedly  satisfied,  and  we  formulate  a  novel  optimization  criterion  in  terms  of 
minimizing  the  expected  cost  in  between  satisfactions  of  this  proposition.  We  propose  a 
sufficient  condition  for  a  policy  to  be  optimal,  and  develop  a  dynamic  programming 
algorithm  that  synthesizes  a  policy  that  is  optimal  under  some  conditions,  and  sub- 
optimal  otherwise. 

(5)  Continuous-Time  MDP  Control  from  Specifications  given  as  CSL  formulas 

In  [Medina,  ICRA  2012],  we  considered  a  mobile  robot  whose  performance  is  measured 
by  the  completion  of  temporal  logic  tasks  within  a  certain  period  of  time.  In  addition  to 
such  time  constraints,  the  planning  algorithm  must  also  deal  with  changes  in  the  robot’s 
workspace  during  task  execution.  In  our  case,  the  robot  was  deployed  in  a  partitioned 
environment  subjected  to  structural  changes  in  which  doors  shift  from  open  to  closed  and 
vice-versa.  The  motion  of  the  robot  was  modeled  as  a  Continuous  Time  Markov  Decision 
Process  and  the  robot’s  mission  was  expressed  as  a  Continuous  Stochastic  Logic  (CSL) 
temporal  logic  specification.  An  approximate  solution  to  find  a  control  strategy  that 
satisfies  such  specifications  was  derived  for  a  subset  of  probabilistic  CSL  formulae. 
Simulation  and  experimental  results  were  provided  to  illustrate  the  method. 


(6)  Automatic  synthesis  of  (optimal)  control  and  communication  strategies  for 
robotic  teams  from  global  specifications  given  as  Regular  Expressions  (RE)  and 
LTL  formulas  over  sets  of  environmental  service  requests 

In  [Chen  TRO  2012,  Chen  DARS  2010,  Chen  IROS  2010],  we  presented  a  computational 
framework  for  automatic  synthesis  of  control  and  communication  strategies  for  a  robotic 
team  from  task  specifications  given  as  regular  expressions  about  servicing  requests  in  an 
environment.  We  assumed  that  the  location  of  the  requests  in  the  environment  and  the 
robot  capacities  and  cooperation  requirements  to  service  the  requests  were  known.  Our 
approach  was  based  on  two  main  ideas.  First,  we  extended  recent  results  from  formal 
synthesis  of  distributed  systems  to  check  for  the  distributability  of  the  task  specification 
and  to  generate  local  specifications,  while  accounting  for  the  service  and  communication 
capabilities  of  the  robots.  Second,  by  using  a  technique  inspired  from  LTL  model 
checking,  we  generated  individual  control  and  communication  strategies.  We  illustrated 
the  method  with  experimental  results  in  our  Robotic  Urban-Like  Environment  (RULE). 
In  [Chen  CDC  2011],  we  extended  this  technique  to  specifications  given  in  full  LTL. 
Experimental  validations  were  included  as  well. 

In  [Ulusoy  IROS  2011],  we  presented  a  method  for  automatically  planning  optimal  paths 
for  a  group  of  robots  that  satisfy  a  common  high  level  mission  specification.  Each  robot’s 
motion  in  the  environment  is  modeled  as  a  weighted  transition  system.  The  mission  for 
the  robots  is  given  as  a  general  Linear  Temporal  Logic  formula.  In  addition,  an 
optimizing  proposition  must  repeatedly  be  satisfied  by  the  group  of  robots.  The  goal  for 
the  group  is  to  minimize  the  maximum  time  between  satisfying  instances  of  the 
optimizing  proposition.  Our  method  is  guaranteed  to  compute  an  optimal  set  of  robot 
paths.  We  utilized  a  timed  automaton  representation  in  order  to  capture  the  relative 
position  of  the  robots  in  the  environment.  We  then  obtained  a  bisimulation  of  this  timed 
automaton  as  a  relatively  compact  finite  transition  system  that  captured  the  joint  behavior 
of  the  robots.  We  then  applied  our  earlier  algorithm  for  the  single  robot  case  to  optimize 
the  group  motion.  We  presented  a  simulation  of  robots  performing  a  persistent 
monitoring  task  in  a  road  network  environment,  and  characterized  the  complexity  of  our 
method. 

(7)  LTL  Robot  Control  based  on  Automata  Learning  of  Environmental  Dynamics 

In  [Chen  ICRA  2012],  we  developed  a  technique  to  automatically  generate  a  control 
policy  for  a  robot  moving  in  an  environment  that  includes  elements  with  partially 
unknown,  changing  behavior.  The  robot  was  required  to  achieve  an  optimal  surveillance 
mission,  in  which  a  certain  request  needed  to  be  serviced  repeatedly,  while  the  expected 
time  in  between  consecutive  services  was  minimized.  We  defined  a  fragment  of  LTL  to 
describe  such  a  mission  and  formulated  the  problem  as  a  temporal  logic  game.  Our 
approach  was  based  on  two  main  ideas.  First,  we  extended  results  in  automata  learning  to 
detect  patterns  of  the  partially  unknown  behavior  of  the  elements  in  the  environment. 
Second,  we  employed  an  automata-theoretic  method  to  generate  the  control  policy.  We 
showed  that  the  obtained  control  policy  converges  to  an  optimal  one  when  the  unknown 
behavior  patterns  are  fully  learned.  We  also  implemented  the  proposed  framework  in 


MATLAB  and  used  in  conjunction  with  a  simulator  to  demonstrate  the  behavior  of  a 
robot  performing  missions  in  the  environment. 
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